Accelerated Event-by-Event Neutrino Oscillation Reweighting with Matter Effects on a GPU

Author: A C Kaboth
D Payne
Khronos OpenCL Working Group
N. Whitehead
NVIDIA Corporation
OpenMP Architecture Review Board
P. Pomorski
R G Calland
R. Wendell
Publication venue: IOP Publishing
Publication date: 29/11/2013

Oscillation probability calculations are becoming increasingly CPU intensive in modern neutrino oscillation analyses. The independence of reweighting individual events in a Monte Carlo sample lends itself to parallel implementation on a Graphics Processing Unit. The library "Prob3++" was ported to the GPU using the CUDA C API, allowing for large-scale parallelized calculations of neutrino oscillation probabilities through matter of constant density, decreasing the execution time by a factor of 75 when compared to performance on a single CPU.

Comment: Final update, post submission. Updated version: quantified the difference in event rates for binned and event-by-event reweighting with a typical binning scheme; improved formatting of references.
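The abstract's point that each event can be reweighted independently can be illustrated with a minimal sketch. This is not the Prob3++ calculation: it assumes the textbook two-flavour vacuum survival formula in place of the three-flavour matter calculation. The function names, baseline and oscillation parameters are hypothetical values chosen only for illustration.

```python
import math

def survival_probability(energy_gev, baseline_km, sin2_2theta, dm2_ev2):
    """Two-flavour vacuum survival probability.

    Assumption: a simplified stand-in for the full three-flavour
    calculation with matter effects described in the abstract.
    """
    phase = 1.267 * dm2_ev2 * baseline_km / energy_gev
    return 1.0 - sin2_2theta * math.sin(phase) ** 2

def reweight_events(energies_gev, baseline_km, sin2_2theta, dm2_ev2):
    """Compute one weight per event; no event depends on any other."""
    return [
        survival_probability(e, baseline_km, sin2_2theta, dm2_ev2)
        for e in energies_gev
    ]

# Hypothetical Monte Carlo event energies (GeV) and parameters.
energies = [0.4, 0.6, 1.0, 2.5]
weights = reweight_events(energies, baseline_km=295.0,
                          sin2_2theta=1.0, dm2_ev2=2.4e-3)
```

Because every weight is a pure function of its own event, the loop body could in principle be assigned to one GPU thread per event, which is the property the abstract relies on.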

arXiv.org e-Print Archive

Crossref

Royal Holloway - Pure

GPU Concurrency: Weak Behaviours and Programming Assumptions

Author: AMD.
Cederman D.
Collier W.
Core ARM.
Feng W.
Hower D. R.
Hwu W.-m. W.
Khronos OpenCL Working Group
Sanders J.
Sorensen T.
Southern AMD.
Stuart J. A.
Weaver D. L.
Xiao S.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 10/11/2014

Concurrency is pervasive and perplexing, particularly on graphics processing units (GPUs). Current specifications of languages and hardware are inconclusive; thus programmers often rely on folklore assumptions when writing software. To remedy this state of affairs, we conducted a large empirical study of the concurrent behaviour of deployed GPUs. Armed with litmus tests (i.e. short concurrent programs), we questioned the assumptions in programming guides and vendor documentation about the guarantees provided by hardware. We developed a tool to generate thousands of litmus tests and run them under stressful workloads. We observed a litany of previously elusive weak behaviours, and exposed folklore beliefs about GPU programming, often supported by official tutorials, as false. As a way forward, we propose a model of Nvidia GPU hardware, which correctly models every behaviour witnessed in our experiments. The model is a variant of SPARC Relaxed Memory Order (RMO), structured following the GPU concurrency hierarchy.
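To make "litmus test" concrete, here is a minimal sketch of the well-known message-passing shape. It is neither the authors' tool nor their proposed model, and the abstract does not say which tests were used. It enumerates the outcomes that sequential consistency allows, so that any other outcome would count as a weak behaviour. All names are hypothetical.

```python
from itertools import combinations

# Message-passing litmus test (both locations start at 0):
#   Thread 0: x = 1; flag = 1
#   Thread 1: r1 = flag; r2 = x
T0 = [("store", "x", 1), ("store", "flag", 1)]
T1 = [("load", "flag", "r1"), ("load", "x", "r2")]

def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        slots = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]

def run(schedule):
    """Execute one interleaving against a single shared memory."""
    memory = {"x": 0, "flag": 0}
    regs = {}
    for op, loc, arg in schedule:
        if op == "store":
            memory[loc] = arg
        else:
            regs[arg] = memory[loc]
    return (regs["r1"], regs["r2"])

# Outcomes (r1, r2) reachable under sequential consistency.
sc_outcomes = {run(s) for s in interleavings(T0, T1)}

# Seeing the flag set but stale data is not sequentially consistent.
weak_outcome = (1, 0)
is_weak = weak_outcome not in sc_outcomes
```

Here `sc_outcomes` contains only (0, 0), (0, 1) and (1, 1). A hardware run that produced (1, 0) would therefore be a weak behaviour in the sense the abstract describes.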

Crossref

Oxford University Research Archive

Kent Academic Repository

Spiral - Imperial College Digital Repository